Retry api calls on 'service unavailable' #266
base: stable/rocky-m3
The keystoneauth1 adapter underlying the cinder, glance, and neutron API calls already supports retrying on a 503 status code if the corresponding parameter is passed. Currently, only connection failures are retried. If the service sits behind a load balancer, however, a connection failure is rather unlikely; a 503 Service Unavailable error would be returned instead.

Change-Id: I82cf1d6eecad1262841c49e10d30c1ec5ba26f80
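To make the difference between the two retry paths concrete, here is a minimal, self-contained sketch of the behavior described above. It is a model only, not keystoneauth1's implementation: `request_with_retries`, `make_backend`, and the local `ConnectFailure` class are hypothetical names for illustration, and the parameter names are merely modeled on keystoneauth1's `connect_retries` and `status_code_retries`.

```python
import time


class ConnectFailure(Exception):
    """Hypothetical stand-in for a transport-level connection error."""


def request_with_retries(send, connect_retries=0, status_code_retries=0,
                         retriable_status_codes=(503,), delay=0.0):
    """Call send() and retry on connection failures and retriable statuses.

    send() returns an HTTP status code or raises ConnectFailure.
    The two retry budgets are tracked independently of each other.
    """
    connect_left = connect_retries
    status_left = status_code_retries
    while True:
        try:
            status = send()
        except ConnectFailure:
            if connect_left <= 0:
                raise  # connection retry budget exhausted
            connect_left -= 1
        else:
            # Return unless the status is retriable and budget remains.
            if status not in retriable_status_codes or status_left <= 0:
                return status
            status_left -= 1
        time.sleep(delay)


def make_backend(responses):
    """Build a fake send() that replays the given statuses/exceptions."""
    it = iter(responses)

    def send():
        item = next(it)
        if isinstance(item, Exception):
            raise item
        return item

    return send
```

With only `connect_retries` set, a 503 from a load balancer is handed straight back to the caller; passing the same count as `status_code_retries` as well, e.g. `request_with_retries(send, connect_retries=n, status_code_retries=n)`, makes the call retry in that case too.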
Our admin/internal APIs don't go through a load balancer, so connection errors are the more likely failure for us.
Why did you opt for using the same setting for both kinds of retries? We wouldn't be able to disable one of them if it becomes problematic.
By default, this only retries 503, which is probably fine for requests changing anything in Cinder, but for GET-requests, we might also want to retry on 500 - e.g. the DB restarting and thus requests failing with "internal server error". What do you think?
The change is general and not specific to our situation. But agreed, then it doesn't help us much.
The option is called
We still can disable retries, just not separately. I would say that is good enough for fixing something which requires a fix.
The retries are enabled for all requests, including POST, PUT, etc. I would rather not retry those at the lowest level for anything but 503, as the APIs are not guaranteed to be idempotent. For the DB restart case, I would rather fix the retry logic in the application's DB API (oslo.db) to handle the restart better (more likely achievable in a short time frame), or fix it on the DB side so that the API is zero-downtime (quite a bit of effort).
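The trade-off discussed in this thread can be sketched as a small retry policy. This only illustrates the idea raised in review and is not what this change implements: `should_retry` and `SAFE_METHODS` are hypothetical names, and the choice of which methods count as safe to repeat is an assumption.

```python
# Methods assumed to have no side effects, so repeating them is harmless.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS"})


def should_retry(method, status):
    """Decide whether a failed request may be retried.

    503: the request is assumed not to have been processed, so any
         method may be retried.
    500: the request may have been partially processed, so only
         side-effect-free methods are retried.
    """
    if status == 503:
        return True
    if status == 500:
        return method.upper() in SAFE_METHODS
    return False
```

Under this policy a POST that fails with 500 is never repeated, which avoids duplicating a non-idempotent operation, while a GET failing during a short DB restart would still get another attempt.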
Sounds good. Thank you for the explanations.